Abstract. In this work we first study in detail the formulation of the primal-dual interior-point
method for linear programming. We show that, contrary to popular belief, it cannot
be viewed as the damped Newton method applied to the Karush-Kuhn-Tucker conditions
for the logarithmic barrier function problem. Next we extend the formulation to general
nonlinear programming, and then validate this extension by demonstrating that this algorithm
can be implemented so that it is locally and Q-quadratically convergent under only
the standard Newton’s method assumptions. We also establish a global convergence theory
for this algorithm and include promising numerical experimentation.

Key Words. Interior-point methods, primal-dual methods, nonlinear programming, superlinear
and quadratic convergence, global convergence.

1 Introduction


Motivated by the impressive computational performance of primal-dual interior-point methods
for linear programming (see for example Lustig, Marsten, and Shanno (Ref. 1)), it is
natural that researchers have directed their attention to the generally more difficult area
of nonlinear programming. Recently there has been considerable activity in the area of
interior-point methods for quadratic and convex programming. We shall not attempt to
list these research efforts, and restrict our attention to interior-point methods for nonconvex
programming. In the area of barrier methods we mention M. Wright (Ref. 2) and Nash
and Sofer (Ref. 3). S. Wright (Ref. 4) considers the monotone nonlinear complementarity
problem and Monteiro, Pang, and Wang (Ref. 5) consider the nonmonotone nonlinear
complementarity problem. S. Wright (Ref. 6) considered the linearly constrained nonlinear
programming problem. Lasdon, Yu, and Plummer (Ref. 7) considered various interior-point
method formulations for the general nonlinear programming problem. An algorithm and
corresponding theory were given by Yamashita (Ref. 8). Other work in the area of interior-point
methods for nonlinear programming includes McCormick (Ref. 9), Anstreicher and Vial
(Ref. 10), Kojima, Megiddo, and Noma (Ref. 11), and Monteiro and Wright (Ref. 12).

The  primary  objective  of  this  paper  is  to  carry  over  from  linear  programming  a  viable 
formulation  of  an  interior-point  method  for  the  general  nonlinear  programming  problem. 
In  order  to  accomplish  this  objective,  we  first  study  in  extensive  detail  the  formulation  of 
the  highly  successful  Kojima-Mizuno-Yoshise  (Ref.  13)  primal-dual  interior-point  method  for 
linear programming. It has been our basic perception that the fundamental ingredient in this
formulation is the perturbed Karush-Kuhn-Tucker conditions, and that the relationship between
these conditions and the logarithmic barrier function method has not been clearly delineated.
Hence  Sections  2-4  are  devoted  to  this  concern.  Of  particular  interest  in  this  context  is 
Proposition  2.3  which  shows  that  Newton’s  method  applied  to  the  Karush-Kuhn-Tucker 
conditions  for  the  logarithmic  barrier  function  formulation  of  the  primal  linear  program 
and  Newton’s  method  applied  to  the  perturbed  Karush-Kuhn-Tucker  conditions  (i.e.  the 
Kojima-Mizuno-Yoshise  primal-dual  method)  never  coincide. 

In  Section  4  we  state  what  we  consider  to  be  a  basic  formulation  of  an  interior-point 
method  for  the  general  nonlinear  programming  problem.  The  viability  of  this  formulation 
is  reinforced  by  the  local  theory  developed  in  Section  5.  Here  we  demonstrate  that  local, 
superlinear,  and  quadratic  convergence  can  all  be  obtained  for  the  interior-point  method, 
under exactly the conditions needed for the standard Newton’s method theory. The global
convergence theory is the subject of Section 6. In Section 7 we present some preliminary
numerical  experimentation  using  the  2-norm  of  the  residual  as  our  merit  function.  Finally 
in  Section  8  we  give  some  concluding  remarks. 

The  choice  of  merit  function  for  interior-point  methods  is  not  a  focus  of  the  current 
research.  Such  activity  is  of  importance  and  merits  further  investigation.  Our  globalization 
theory  conveniently  and  effectively  uses  the  2-norm  of  the  residual  as  merit  function.  At  the 
very  least  it  can  be  viewed  as  a  demonstration  of  the  viability  of  such  theory  for  general 
interior-point  methods. 

2  Interpretation  of  the  LP  Formulation 

Consider the primal linear program in the standard form

$$\min\; c^T x \tag{1a}$$
$$\text{s.t.}\;\; Ax = b \tag{1b}$$
$$x \ge 0 \tag{1c}$$

where $c, x \in R^n$, $b \in R^m$, and $A \in R^{m \times n}$. The dual linear program can be written

$$\max\; b^T y \tag{2a}$$
$$\text{s.t.}\;\; A^T y + z = c \tag{2b}$$
$$z \ge 0 \tag{2c}$$

and $z \in R^n$ is called the vector of dual slack variables.

Basic Assumption: The matrix $A$ has full rank.

As is done in this area, we use $X$ to denote the diagonal matrix with diagonal $x$ and
employ analogous notation for other quantities. Also $e$ is a vector of all ones whose dimension
will vary with the context.

A point $x \in R^n$ is said to be strictly feasible for problem (1) if it is both feasible and
positive. A point $z \in R^n$ is said to be feasible for problem (2) if there exists $y \in R^m$ such
that $(y,z)$ is feasible for problem (2). Moreover, $z$ (or $(y,z)$) is said to be strictly feasible
(for problem (2)) if it is feasible and $z$ is positive. A pair $(x,z)$ is said to be on the central
path (at $\mu > 0$) if $x_i z_i = \mu$ for all $i$, and $x$ is feasible for problem (1), and $z$ is feasible for
problem (2). We also say that $x$ is on the central path (at $\mu > 0$) if $(x, \mu X^{-1} e)$ is on the
central path, i.e., if $\mu X^{-1} e$ is feasible for problem (2).

The first-order or Karush-Kuhn-Tucker (KKT) optimality conditions for problem (1) are

$$F(x,y,z) = \begin{pmatrix} Ax - b \\ A^T y + z - c \\ XZe \end{pmatrix} = 0, \qquad (x,z) \ge 0. \tag{3}$$

By the perturbed KKT conditions for problem (1) we mean

$$F_\mu(x,y,z) = \begin{pmatrix} Ax - b \\ A^T y + z - c \\ XZe - \mu e \end{pmatrix} = 0, \qquad (x,z) > 0, \quad \mu > 0. \tag{4}$$

Observe  that  the  perturbation  is  made  only  to  the  complementarity  equation.  Fiacco 
and  McCormick  (Ref.  14)  were  probably  the  first  to  consider  the  perturbed  KKT  conditions. 
They did so in the context of the general inequality constrained nonlinear programming problem.
They made several key observations including the fact that the sufficiency conditions
for  the  unconstrained  minimization  of  the  logarithmic  barrier  function  were  implied  locally 
by  the  perturbed  KKT  conditions  and  the  standard  second-order  sufficiency  conditions. 

In  1987  Kojima,  Mizuno,  and  Yoshise  (Ref.  13)  proposed  the  now  celebrated  primal-dual 
interior-point method for linear programming. In essence, their algorithm is damped Newton
applied to the perturbed KKT conditions (4). These authors state that their algorithm
is  based  on  Megiddo’s  (Ref.  15)  work  concerning  the  classical  logarithmic  barrier  function 
method.  This  pioneering  work  of  Kojima,  Mizuno,  and  Yoshise  has  motivated  considerable 
research activity in the general area of primal-dual interior-point methods for linear programming,
quadratic programming, convex programming, linear complementarity problems,
and  some  activity  in  general  nonlinear  programming.  However,  the  relationship  between  the 
perturbed  KKT  conditions  and  the  logarithmic  barrier  function  problem  seems  not  to  have 
been  well  articulated  and  is  often  misstated.  Therefore,  we  will  rigorously  pursue  a  study  of 
this  relationship. 

Our  intention  is  to  demonstrate  the  following.  While  the  perturbed  KKT  conditions  are 
in  an  obvious  sense  equivalent  to  the  KKT  conditions  for  the  logarithmic  barrier  function 
problem,  they  are  not  the  KKT  conditions  for  this  problem  or  for  any  other  unconstrained  or 
equality  constrained  optimization  problem.  Furthermore,  the  primal-dual  Newton  interior- 
point  method  cannot  be  viewed  as  Newton’s  method  applied  to  the  KKT  conditions  for  the 
logarithmic  barrier  function  problem;  indeed  these  latter  iterates  and  the  former  iterates 
never coincide. Towards this end we begin by considering the logarithmic barrier function
problem associated with problem (1)

$$\min\; c^T x - \mu \sum_{i=1}^{n} \log(x_i) \tag{5a}$$
$$\text{s.t.}\;\; Ax = b \tag{5b}$$
$$x > 0 \tag{5c}$$

for a fixed $\mu > 0$. The KKT conditions for problem (5) are

$$\hat{F}_\mu(x,y) = \begin{pmatrix} A^T y + \mu X^{-1} e - c \\ Ax - b \end{pmatrix} = 0, \qquad x > 0. \tag{6}$$

Proposition 2.1 The perturbed KKT conditions for problem (1) given by (4), and the KKT
conditions for the logarithmic barrier function problem (5) given by (6) are equivalent in the
sense that they have the same solutions, i.e., $\hat{F}_\mu(x,y) = 0$ if and only if $F_\mu(x, y, \mu X^{-1} e) = 0$.

Proof: The proof is straightforward. □

In  spite  of  the  equivalence  described  in  Proposition  2.1,  we  have  the  following  anomaly. 

Proposition 2.2 The perturbed KKT conditions for problem (1), i.e., $F_\mu(x,y,z) = 0$,
or any permutation of these equations, are not the KKT conditions for the logarithmic
barrier function problem (5) or any other (smooth) unconstrained or equality constrained
optimization problem.

Proof: If $F_\mu(x,y,z) = 0$ were the KKT conditions for some equality constrained optimization
problem we would have that there exists a Lagrangian function $L$ such that

$$\nabla L(x,y,z) = F_\mu(x,y,z).$$

It would then follow that

$$\nabla^2 L(x,y,z) = F'_\mu(x,y,z).$$

However $\nabla^2 L(x,y,z)$ is a Hessian matrix and is therefore symmetric. But direct calculations
show that $F'_\mu(x,y,z)$, or any permutation of its rows, is not symmetric. This argument also
excludes unconstrained optimization problems. □

We tacitly assumed that $L(x,y,z)$ in the proof of Proposition 2.2 was of class $C^2$.
Observe that the perturbed KKT conditions (4) are obtained from (6), the KKT conditions
for the logarithmic barrier function problem, by introducing the auxiliary variables

$$z = \mu X^{-1} e$$

and then expressing these nonlinear defining relations in the form

$$XZe = \mu e.$$

Considerably  more  will  be  said  about  this  nonlinear  transformation  in  Section  3.  We  now 
demonstrate  exactly  how  nonlinear  this  transformation  is  by  showing  that  the  equivalence 
depicted  in  Proposition  2.1  in  no  way  carries  over  to  a  Newton  algorithmic  equivalence. 
Certainly,  the  possibility  of  such  an  equivalence  is  not  precluded  by  Proposition  2.2  alone. 

Proposition 2.3 Consider a triple $(x,y,z)$ such that $x$ is strictly feasible for problem (1) and
$(y,z)$ is strictly feasible for problem (2). Let $(\Delta x, \Delta y, \Delta z)$ denote the Newton correction at
$(x,y,z)$ obtained from the nonlinear system $F_\mu(x,y,z) = 0$ given by (4). Also let $(\Delta x', \Delta y')$
denote the Newton correction at $(x,y)$ obtained from the nonlinear system $\hat{F}_\mu(x,y) = 0$
given by (6). Then the following are equivalent:

(i) $(\Delta x, \Delta y) = (\Delta x', \Delta y')$

(ii) $\Delta x = 0$

(iii) $\Delta x' = 0$

(iv) $x$ is on the central path at $\mu$

Proof: The two Newton systems that we are concerned with are

$$A^T \Delta y' - \mu X^{-2} \Delta x' = -A^T y - \mu X^{-1} e + c \tag{7a}$$
$$A \Delta x' = 0 \tag{7b}$$

and

$$A^T \Delta y + \Delta z = 0 \tag{8a}$$
$$A \Delta x = 0 \tag{8b}$$
$$Z \Delta x + X \Delta z = -XZe + \mu e. \tag{8c}$$

These two linear systems have unique solutions under the assumptions that $(x,z) > 0$
and the matrix $A$ has full rank. We briefly outline a proof for (7). A proof for (8) is only
slightly more difficult. Consider the homogeneous system

$$A^T \Delta y' - \mu X^{-2} \Delta x' = 0 \tag{9a}$$
$$A \Delta x' = 0. \tag{9b}$$

If we multiply the first equation of (9) by $AX^2$ and use the second equation we obtain

$$(AX^2A^T)\Delta y' = 0.$$

Moreover $AX^2A^T$ is invertible under our assumptions. Hence $\Delta y' = 0$ and therefore from
(9) $\Delta x' = 0$. This implies that our system has a unique solution.

Proof of (i) $\Rightarrow$ (ii)

Solving the last equation of (8) for $\Delta z$, substituting in the first, and observing that by
feasibility $z = c - A^T y$ leads to

$$A^T \Delta y - X^{-1} Z \Delta x = -A^T y - \mu X^{-1} e + c.$$

Comparing the last equation with the first in (7) gives

$$XZ \Delta x = \mu \Delta x'. \tag{10}$$

From the first two equations in (8) we see that $\Delta x^T \Delta z = 0$, i.e.,

$$\Delta x_1 \Delta z_1 + \dots + \Delta x_n \Delta z_n = 0. \tag{11}$$

Define a subset $I$ of $\{1, \dots, n\}$ as follows. The index $i \in I$ if and only if $\Delta x_i \ne 0$. Now by
way of contradiction suppose that $I$ is not empty. Since $\Delta x = \Delta x'$ by (i), equation (10) gives
$x_i z_i = \mu$ for $i \in I$, and so from the last equation in (8) we have that

$$z_i \Delta x_i + x_i \Delta z_i = 0 \quad \text{for } i \in I.$$

Since $z_i > 0$ and $x_i > 0$, the last equation implies that $\Delta x_i$ and $\Delta z_i$ are both not zero and
are of opposite sign. However, this contradicts (11). This is the contradiction that we were
searching for and we may now conclude that $I$ is empty. Hence $\Delta x = 0$ and we have shown
that (i) $\Rightarrow$ (ii).

Proof of (ii) $\Rightarrow$ (iii)

Suppose that $\Delta x = 0$. Then from the first and third equation in (8) we see that

$$A^T \Delta y = z - \mu X^{-1} e.$$

Hence $(0, \Delta y)$ also solves (7).

Proof of (iii) $\Rightarrow$ (iv)

If $\Delta x' = 0$, then from the first equation in (7)

$$A^T (y + \Delta y') + \mu X^{-1} e - c = 0.$$

Therefore $\mu X^{-1} e$ is strictly feasible for problem (2). This says that $x$ is on the central path
at $\mu$.

Proof of (iv) $\Rightarrow$ (i)

Suppose that $x$ is on the central path. This means that $\mu X^{-1} e$ is feasible for problem (2),
i.e., there exists $\hat{y}$ such that $(\hat{y}, \mu X^{-1} e)$ is feasible for problem (2). It follows that $(0, \hat{y} - y)$
solves (7). Also $(0, \hat{y} - y, \mu X^{-1} e - z)$ solves (8). Consequently $(\Delta x', \Delta y') = (\Delta x, \Delta y)$ and
we have established that (iv) $\Rightarrow$ (i), and finally the proposition. □

Remark 2.1 Proposition 2.3 is extremely restrictive. It is incorrect to interpret it as saying
that the two Newton iterates agree only if the current $x$ is on the central path. It says that
these iterates agree if and only if there is no movement in $x$. This characterizes the redundant
situation when $x$ is on the central path at $\mu$ and we are trying to find an $x$ which is on the
central path at $\mu$. If $x$ is on the central path at $\mu$ and we are trying to find a point on the
central path at $\hat{\mu} \ne \mu$, then the two Newton iterates will not generate $(\Delta x, \Delta y) = (\Delta x', \Delta y')$.
Simply stated, the two Newton iterates never coincide.
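
As a numerical illustration of Proposition 2.3, the following sketch solves the Newton systems (7) and (8) in exact rational arithmetic for a toy linear program with two variables and one constraint. The problem data (min 2x1 + 3x2 s.t. x1 + 2x2 = 3), the starting points, and the helper names `solve`, `barrier_step`, and `primal_dual_step` are assumptions made for this example and do not come from the paper.

```python
from fractions import Fraction as Fr

def solve(M, rhs):
    """Solve the square system M u = rhs by Gauss-Jordan elimination over the rationals."""
    n = len(M)
    aug = [[Fr(v) for v in row] + [Fr(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        # pick any row at or below the diagonal with a nonzero entry in this column
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [u - f * v for u, v in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def barrier_step(a, c, x, y, mu):
    """Newton correction (dx1', dx2', dy') from system (7), one constraint a^T x = b."""
    x, y, mu = [Fr(v) for v in x], Fr(y), Fr(mu)
    M = [[-mu / x[0] ** 2, 0, a[0]],
         [0, -mu / x[1] ** 2, a[1]],
         [a[0], a[1], 0]]
    rhs = [c[0] - a[0] * y - mu / x[0],
           c[1] - a[1] * y - mu / x[1],
           0]
    return solve(M, rhs)

def primal_dual_step(a, x, z, mu):
    """Newton correction (dx1, dx2, dy, dz1, dz2) from system (8)."""
    x, z, mu = [Fr(v) for v in x], [Fr(v) for v in z], Fr(mu)
    M = [[0, 0, a[0], 1, 0],       # (8a), first component
         [0, 0, a[1], 0, 1],       # (8a), second component
         [a[0], a[1], 0, 0, 0],    # (8b)
         [z[0], 0, 0, x[0], 0],    # (8c), first component
         [0, z[1], 0, 0, x[1]]]    # (8c), second component
    rhs = [0, 0, 0, mu - x[0] * z[0], mu - x[1] * z[1]]
    return solve(M, rhs)

# Toy data (assumed): min 2*x1 + 3*x2  s.t.  x1 + 2*x2 = 3,  x >= 0, with mu = 1.
a, c, mu = (1, 2), (2, 3), 1
y, z = 0, (2, 3)                 # strictly dual feasible: a*y + z = c and z > 0
off_path = (Fr(2), Fr(1, 2))     # strictly feasible, not on the central path at mu
on_path = (Fr(1), Fr(1))         # mu/x_i = c_i - a_i*1, so on the central path at mu

for x in (off_path, on_path):
    print(barrier_step(a, c, x, y, mu), primal_dual_step(a, x, z, mu)[:3])
```

At the off-path point the two corrections in x differ, while at the on-path point both corrections leave x unchanged and agree in y, as the proposition predicts.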

3  Interpretation  of  the  Perturbed  KKT  Conditions 

There  is  a  philosophical  parallel  between  the  modification  of  the  penalty  function  method 
that  leads  to  the  multiplier  method  and  the  modification  of  the  KKT  conditions  for  the 
logarithmic  barrier  function  problem  that  leads  to  the  perturbed  KKT  conditions.  The 
similarity  is  that  both  modifications  introduce  an  auxiliary  variable  to  serve  as  approximation 
to  the  multiplier  vector  and  use  this  as  a  vehicle  for  removing  inherent  ill-conditioning  from 
the  formulation.  However,  the  roles  that  the  two  auxiliary  variables  play  in  the  removal  of 
inherent ill-conditioning are quite different. We believe that this parallel adds perspective
to the role of the perturbed KKT conditions and therefore pursue it in some detail. The
following  comments  are  an  attempt  to  shed  understanding  on  the  perturbed  KKT  conditions 
and  are  not  intended  to  be  viewed  as  mathematical  theory. 

For the sake of simplicity our constrained problems will have only one constraint. And
for the sake of illustration the multiplier associated with this constraint will be nonzero at
the solution. The amount of smoothness required is not an issue and all functions will be as
smooth as the context requires.

Consider the equality constrained optimization problem

$$\min\; f(x) \tag{12a}$$
$$\text{s.t.}\;\; h(x) = 0 \tag{12b}$$

where $f, h : R^n \to R$. The KKT conditions for problem (12) are

$$\nabla f(x) + \lambda \nabla h(x) = 0, \tag{13a}$$
$$h(x) = 0. \tag{13b}$$

The $\ell_2$-penalty function associated with problem (12) is

$$P(x;\rho) = f(x) + \frac{\rho}{2}\, h(x)^T h(x).$$

The gradient of $P$ is given by

$$\nabla P(x;\rho) = \nabla f(x) + \rho h(x) \nabla h(x) \tag{14}$$

and the Hessian of $P$ is given by

$$\nabla^2 P(x;\rho) = \nabla^2 f(x) + \rho h(x) \nabla^2 h(x) + \rho \nabla h(x) \nabla h(x)^T.$$

The penalty function method consists of the generation of the sequence $\{x_k\}$ defined by

$$x_k = \arg\min P(x;\rho_k).$$

Suppose that $x_k \to x^*$, a solution of (12), and let $\lambda^*$ be the associated multiplier. Then
we must have $\rho_k h(x_k) \to \lambda^*$. Since $h(x_k) \to 0$, and we are assuming that $\lambda^* \ne 0$, necessarily
$\rho_k \to +\infty$. However as $\rho_k \to +\infty$ the conditioning of the Hessian matrix $\nabla^2 P(x_k;\rho_k)$
becomes arbitrarily bad. The problem here is that we are asking too much from the penalty
parameter $\rho_k$. We are asking it to contribute to good global behavior by penalizing constraint
violation and we are asking it to contribute to good local behavior by forcing $\rho_k h(x_k)$ to
approximate the multiplier. Hestenes (Ref. 16) in 1969 proposed a way of circumventing
the conditioning deficiency. He introduced an auxiliary variable $\lambda$ and replaced $\rho h(x)$ in
(14) with $\lambda + \rho h(x)$. This modification effectively converts the penalty function into the
augmented Lagrangian. The role of the auxiliary variable $\lambda$ estimating the multiplier was
relegated to that of parameter in that $\lambda_k$ was held fixed during a minimization phase of
the augmented Lagrangian for the determination of $x_k$ and then updated according to the
formula $\lambda_{k+1} = \lambda_k + \rho_k h(x_k)$. In this way the role of $\rho_k h(x_k)$ is no longer one of estimating the
multiplier, but one of estimating the correction to the multiplier. Hence it is most appropriate
for $\rho_k h(x_k) \to 0$ and the requirement that $\rho_k \to +\infty$ is no longer necessary. The multiplier
method has enjoyed considerable success in the computational sciences marketplace.
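
The contrast between the two methods can be seen on a toy problem that is not taken from the paper: min ½(x1² + x2²) s.t. x1 − 1 = 0, whose solution is x* = (1, 0) with multiplier λ* = −1. For this problem the minimizers of the penalty function and of the augmented Lagrangian are available in closed form (x2 = 0 throughout), so the sketch below simply evaluates them; the function names are hypothetical.

```python
from fractions import Fraction as Fr

def penalty_minimizer(rho):
    """x1 minimizing 0.5*x1**2 + (rho/2)*(x1 - 1)**2, i.e. x1 + rho*(x1 - 1) = 0."""
    return Fr(rho) / (1 + Fr(rho))

def multiplier_method(rho, lam=Fr(0), iters=3):
    """Hestenes' multiplier method with a fixed penalty parameter rho."""
    rho, lam = Fr(rho), Fr(lam)
    history = []
    for _ in range(iters):
        # minimizer of 0.5*x1**2 + lam*(x1 - 1) + (rho/2)*(x1 - 1)**2
        x1 = (rho - lam) / (1 + rho)
        # multiplier update: lam_{k+1} = lam_k + rho * h(x_k)
        lam = lam + rho * (x1 - 1)
        history.append((x1, lam))
    return history

# Penalty method: rho * h(x) approaches lambda* = -1 only as rho grows,
# and the Hessian diag(1 + rho, 1) has condition number 1 + rho.
rho = 99
x1 = penalty_minimizer(rho)
print(x1, rho * (x1 - 1), 1 + rho)

# Multiplier method: rho stays fixed at 1 and lam still converges to -1.
print(multiplier_method(1))
```

In this example the multiplier error shrinks by the factor 1/(1 + ρ) at every update, so convergence is obtained without driving ρ to infinity.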

Now, consider the inequality constrained optimization problem

$$\min\; f(x) \tag{15a}$$
$$\text{s.t.}\;\; g(x) \ge 0 \tag{15b}$$

where $f, g : R^n \to R$. The KKT conditions for this problem are

$$\nabla f(x) - z \nabla g(x) = 0 \tag{16a}$$
$$z g(x) = 0 \tag{16b}$$
$$g(x) \ge 0 \tag{16c}$$
$$z \ge 0. \tag{16d}$$

The logarithmic barrier function associated with problem (15) is

$$B(x;\mu) = f(x) - \mu \log(g(x)), \qquad \mu > 0.$$

The gradient of $B$ is given by

$$\nabla B(x;\mu) = \nabla f(x) - \frac{\mu}{g(x)} \nabla g(x),$$

and the Hessian of $B$ is given by

$$\nabla^2 B(x;\mu) = \nabla^2 f(x) - \frac{\mu}{g(x)} \nabla^2 g(x) + \frac{\mu}{g(x)^2} \nabla g(x) \nabla g(x)^T.$$

The logarithmic barrier function method consists of generating a sequence of iterates
$\{x_k\}$ as solutions of the essentially unconstrained problem

$$\min\; B(x;\mu_k) \tag{17a}$$
$$\text{s.t.}\;\; g(x) > 0. \tag{17b}$$

Suppose that the constraint $g(x)$ is binding at a solution $x^*$ of problem (15). As before
we see that convergence of $\{x_k\}$ to $x^*$ requires that $\mu_k/g(x_k) \to z^*$, where $z^*$ is the multiplier
associated with the solution $x^*$. Since $\mu_k/g(x_k) \to z^*$ and $g(x_k) \to 0$ we see that
$\mu_k/g(x_k)^2 \to +\infty$ and the Hessian of the logarithmic barrier function becomes arbitrarily
badly conditioned. As in the case of the penalty function method we are asking the penalty
parameter sequence (barrier parameter sequence in this case) to do too much and the price
once again is inherent ill-conditioning. Now introduce the auxiliary variable $z = \mu/g(x)$ and
write this defining relationship in the benign form $z g(x) = \mu$, so that differentiation will not
create ill-conditioning.

In this fashion the KKT conditions for the logarithmic barrier function problem (17),
namely

$$\nabla f(x) - (\mu/g(x)) \nabla g(x) = 0$$
$$g(x) > 0,$$

are transformed into the perturbed KKT conditions

$$\nabla f(x) - z \nabla g(x) = 0$$
$$z g(x) = \mu$$
$$g(x) > 0$$

as proposed and discussed in Fiacco and McCormick (Ref. 14).
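
The effect of this change of functional form can be illustrated on a one-dimensional toy problem that is not from the paper: min x s.t. x ≥ 0, for which z* = 1 and the barrier minimizer is x = μ. The sketch below, with hypothetical function names, compares one Newton step on the barrier function with one Newton step on the perturbed KKT system, both started from x = 1.

```python
from fractions import Fraction as Fr

def barrier_newton_step(x, mu):
    """One Newton step on B(x; mu) = x - mu*log(x)."""
    grad = 1 - mu / x          # B'(x)
    hess = mu / x ** 2         # B''(x), which equals 1/mu at the minimizer x = mu
    return x - grad / hess

def perturbed_kkt_newton_step(x, z, mu):
    """One Newton step on the system (1 - z, z*x - mu) = 0."""
    dz = 1 - z                           # linearization of 1 - z = 0
    dx = (mu - z * x - x * dz) / z       # linearization of z*x = mu
    return x + dx, z + dz

mu = Fr(1, 100)
print(barrier_newton_step(Fr(1), mu))                 # leaves the region x > 0
print(perturbed_kkt_newton_step(Fr(1), Fr(1), mu))    # lands on (mu, 1)
```

For this toy problem the Jacobian of the perturbed system has determinant z, which stays at 1 as μ tends to zero, whereas the barrier Hessian at its minimizer is 1/μ.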

We now summarize. In the penalty function method the quantity $\rho h(x)$ must approximate
the multiplier, necessitating $\rho_k \to +\infty$. Hence the derivative of $\rho h(x)$ becomes arbitrarily
large leading to arbitrarily bad conditioning of the Hessian matrix. On the other hand in
the logarithmic barrier function method the quantity $\mu/g(x)$ must approximate the multiplier.
Hence $\mu$ cannot go to zero too fast and the derivative of $\mu/g(x)$ becomes arbitrarily
large leading to arbitrarily bad conditioning of the Hessian matrix. In the former case the
difficulty arises from the fact that $\rho \to +\infty$. The introduction of the auxiliary variable $\lambda$
in the multiplier method allows one to remove this requirement; hence the removal of
ill-conditioning. In the latter case the difficulty arises from the differentiation of the functional
form $\mu/g(x)$. The introduction of the auxiliary variable $z$ allows one to change the functional
form so that differentiation no longer leads to ill-conditioning. Hence, while there is
certainly a philosophical similarity between the two approaches, there is no doubt that the
latter is more satisfying and mathematically more elegant. While this transformation seems
rather straightforward, we stress that it leads to significant changes, i.e. the removal of
ill-conditioning and the effect of Proposition 2.3. The main point of the current discussion is
to focus on the similarity between the multiplier methods as a vehicle for removing inherent
ill-conditioning from the penalty function method and the perturbed KKT conditions as a
vehicle for removing inherent ill-conditioning from the logarithmic barrier function problem.
The extent to which ill-conditioning is reflected in computation is not a discussion issue here.

It is perhaps of interest to point out that the auxiliary variable $z$ estimating the multiplier
can be introduced in a logical fashion from a logarithmic barrier function formulation.
Towards this end consider the slack variable form of problem (15)

$$\min\; f(x) \tag{18a}$$
$$\text{s.t.}\;\; g(x) - s = 0 \tag{18b}$$
$$s \ge 0. \tag{18c}$$

The KKT conditions for this problem are

$$\nabla f(x) - z \nabla g(x) = 0 \tag{19a}$$
$$z - w = 0 \tag{19b}$$
$$g(x) - s = 0 \tag{19c}$$
$$ws = 0 \tag{19d}$$
$$(w,s) \ge 0. \tag{19e}$$

The system (19) is equivalent, and Newton algorithmically equivalent, to the system

$$\nabla f(x) - z \nabla g(x) = 0 \tag{20a}$$
$$g(x) - s = 0 \tag{20b}$$
$$zs = 0 \tag{20c}$$
$$(s,z) \ge 0. \tag{20d}$$

The logarithmic barrier function problem for (18) is

$$\min\; f(x) - \mu \log(s) \tag{21a}$$
$$\text{s.t.}\;\; g(x) - s = 0 \tag{21b}$$
$$(s > 0). \tag{21c}$$

The KKT conditions for (21) are

$$\nabla f(x) - z \nabla g(x) = 0 \tag{22a}$$
$$z - (\mu/s) = 0 \tag{22b}$$
$$g(x) - s = 0 \tag{22c}$$
$$s > 0 \tag{22d}$$
$$z > 0. \tag{22e}$$

By writing $z - \mu/s = 0$ as $sz = \mu$ in (22) we arrive at the perturbed version of the KKT
conditions (20). Once more we stress that such a transformation gives an equivalent problem,
removes inherent ill-conditioning, but does not preserve Newton algorithmic equivalence (see
Proposition 2.3). What we have witnessed here is the following. The pure logarithmic barrier
function method deals with an unconstrained problem. Hence there are no multipliers in
the formulation. However, if we first add nonnegativity slack variables, then the logarithmic
barrier function problem is an equality constrained problem and therefore the corresponding
first-order conditions involve multipliers.

We  now  briefly  motivate  the  perturbed  KKT  conditions  in  a  manner  that  has  nothing 
to  do  with  the  logarithmic  barrier  function.  Consider  the  complementarity  equation  for 
problem  (1) 

$$XZe = 0.$$

In any Newton’s method formulation we deal with linearized complementarity

$$Z \Delta x + X \Delta z = -XZe. \tag{23}$$

Linearized  complementarity  leads  to  several  remarkable  algorithmic  properties.  This  was 
observed  by  Tapia  (Ref.  17)  in  1980  for  the  general  nonlinear  programming  problem  and  was 
developed  and  expounded  by  El-Bakry,  Tapia,  and  Zhang  (Ref.  18)  for  the  application  of  the 
primal-dual  interior-point  methods  to  linear  programming.  In  spite  of  its  local  strengths, 
globally,  linearized  complementarity  has  a  serious  flaw.  It  forces  iterates  to  stick  to  the 
boundary  of  the  feasible  region  once  they  approach  that  boundary.  That  is,  if  a  component 
$[x_k]_i$ of a current iterate becomes zero and $[z_k]_i > 0$, then from the linearized complementarity
equation (23) we see that $[x_l]_i = 0$ for all $l > k$, i.e., this component will remain
zero  in  all  future  iterations.  The  analogous  situation  is  true  for  the  z  variable.  Such  an 
undesirable  attribute  clearly  precludes  the  global  convergence  of  the  algorithm.  An  obvious 
correction  is  to  modify  the  Newton  formulation  so  that  zero  variables  can  become  nonzero  in 
subsequent  iterations.  This  can  be  accomplished  by  replacing  the  complementarity  equation 
$XZe = 0$ with perturbed complementarity $XZe = \mu e$ ($\mu > 0$). Of course this is exactly the
introduction  of  the  notion  of  adherence  to  the  central  path.  It  is  known  that  such  adherence 
tends  to  keep  the  iterates  away  from  the  boundary  and  promotes  the  global  convergence  of 
the  Newton  interior-point  method.  It  is  this  central  path  interpretation  that  we  feel  best 
motivates  the  perturbed  KKT  conditions. 
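
The sticking behavior described above can be checked componentwise. The sketch below solves the i-th row of the linearized complementarity equation for the correction in x_i, with the right-hand side of (23) when μ = 0 and with the perturbed right-hand side otherwise; the function name and the sample values are assumptions made for illustration.

```python
from fractions import Fraction as Fr

def dx_component(x_i, z_i, dz_i, mu=Fr(0)):
    """Solve z_i*dx_i + x_i*dz_i = mu - x_i*z_i for dx_i (requires z_i > 0)."""
    return (mu - x_i * z_i - x_i * dz_i) / z_i

# A component that has reached the boundary: x_i = 0 with z_i = 2.
# The value of dz_i is irrelevant here because it is multiplied by x_i = 0.
print(dx_component(Fr(0), Fr(2), Fr(5)))                  # unperturbed: no movement
print(dx_component(Fr(0), Fr(2), Fr(5), mu=Fr(1, 10)))    # perturbed: positive correction
```

With μ = 0 the correction is zero whatever steplength is used, so the component stays at zero; with μ > 0 the correction is μ/z_i and the component can leave the boundary.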




4  Nonlinear  Programming  Formulation 


In  this  section  we  formulate  the  primal-dual  Newton  interior-point  method  for  the  general 
nonlinear  programming  problem.  Our  approach  will  be  to  consider  damped  Newton  applied 
to  the  perturbed  KKT  conditions.  In  order  to  fully  imitate  the  formulation  used  in  the  linear 
programming  case  we  will  transform  inequalities  into  equalities  by  adjoining  nonnegative 
slack  variables. 

Consider  the  general  nonlinear  programming  problem 

$$\min\; f(x) \tag{24a}$$
$$\text{s.t.}\;\; h(x) = 0 \tag{24b}$$
$$g(x) \ge 0 \tag{24c}$$

where $f : R^n \to R$, $h : R^n \to R^m$ ($m < n$), and $g : R^n \to R^p$. The Lagrangian associated
with problem (24) is

$$L(x,y,z) = f(x) + y^T h(x) - z^T g(x).$$

If $x$ is feasible for problem (24), then we let $B(x)$ denote the set of indices of binding inequality
constraints at $x$, i.e.,

$$B(x) = \{ i : g_i(x) = 0,\; i = 1, \dots, p \}.$$

The  KKT  conditions  for  problem  (24)  are 


$$\nabla_x L(x,y,z) = 0 \tag{25a}$$
$$h(x) = 0 \tag{25b}$$
$$g(x) \ge 0 \tag{25c}$$
$$Z g(x) = 0 \tag{25d}$$
$$z \ge 0, \tag{25e}$$

where $\nabla_x L(x,y,z) = \nabla f(x) + \nabla h(x) y - \nabla g(x) z$.

The  standard  Newton’s  method  assumptions  for  problem  (24)  are 

(A1) Existence. There exists $(x^*, y^*, z^*)$, solution to problem (24) and associated multipliers,
satisfying the KKT conditions (25).

(A2) Smoothness. The Hessian matrices $\nabla^2 f(x)$, $\nabla^2 h_i(x)$, $\nabla^2 g_i(x)$ for all $i$ exist and are
locally Lipschitz continuous at $x^*$.

(A3) Regularity. The set $\{\nabla h_1(x^*), \dots, \nabla h_m(x^*)\} \cup \{\nabla g_i(x^*) : i \in B(x^*)\}$ is linearly
independent.

(A4) Second-order Sufficiency. For all $\eta \ne 0$ satisfying $\nabla h_i(x^*)^T \eta = 0$, $i = 1, \dots, m$, and
$\nabla g_i(x^*)^T \eta = 0$, $i \in B(x^*)$, we have $\eta^T \nabla_x^2 L(x^*, y^*, z^*) \eta > 0$.

(A5) Strict Complementarity. For all $i$, $z_i^* + g_i(x^*) > 0$.

The KKT conditions (25) can be written in slack variable form as

$$F(x,y,s,z) = \begin{pmatrix} \nabla_x L(x,y,z) \\ h(x) \\ g(x) - s \\ ZSe \end{pmatrix} = 0, \qquad (s,z) \ge 0. \tag{26}$$

The following proposition is fundamental to our work.


Proposition 4.1 Let conditions (A1) and (A2) hold. Also let $s^* = g(x^*)$. The following
statements  are  equivalent: 

(i)  Conditions  (A3)-(A5)  also  hold. 

(ii)  The  Jacobian  matrix  F'(x*,y*,s*,z*)  of  F(x,y,s,z)  in  (26)  is  nonsingular. 


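Proof: Such an equivalence is reasonably well-known for the equality constrained optimization
problem. Hence we base our proof on that equivalence. To begin with observe that

$$F'(x^*,y^*,s^*,z^*) = \begin{pmatrix} \nabla_x^2 L_* & \nabla h(x^*) & -\nabla g(x^*) & 0 \\ \nabla h(x^*)^T & 0 & 0 & 0 \\ \nabla g(x^*)^T & 0 & 0 & -I \\ 0 & 0 & S^* & Z^* \end{pmatrix} \tag{27}$$

where $\nabla_x^2 L_* = \nabla_x^2 L(x^*, y^*, z^*)$ and the columns are ordered as $(x, y, z, s)$. Consider the
equality constrained optimization problem

$$\min\; f(x)$$
$$\text{s.t.}\;\; h(x) = 0$$
$$g_i(x) = 0, \quad i \in B(x^*).$$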

Observe that the regularity condition (A3) is regularity for this problem and the second-order
sufficiency condition (A4) is second-order sufficiency for this problem. Hence from the
theory of equality constrained optimization we see that (A3) and (A4) are equivalent to the
nonsingularity of the matrix

$$\hat{F}'(x^*,y^*,s^*,z^*) = \begin{pmatrix} \nabla_x^2 L_* & \nabla h(x^*) & -\hat{\nabla} g(x^*) \\ \nabla h(x^*)^T & 0 & 0 \\ \hat{\nabla} g(x^*)^T & 0 & 0 \end{pmatrix}$$

where $\hat{\nabla} g(x^*)$ is the matrix whose columns are $\{\nabla g_i(x^*) : i \in B(x^*)\}$. It is not difficult
to see that the nonsingularity of (27) is equivalent to strict complementarity (A5) and the
nonsingularity of $\hat{F}'(x^*,y^*,s^*,z^*)$. □
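
The role of strict complementarity in Proposition 4.1 can be illustrated on two one-dimensional toy problems that are not from the paper, both with the single constraint g(x) = x ≥ 0 and no equality constraints. The sketch below forms the Jacobian of F(x, s, z) = (∂L/∂x, x − s, zs) at the solution, with columns ordered as (x, s, z), and evaluates its determinant; the function names are hypothetical.

```python
def kkt_jacobian(hess_L, s, z):
    """Jacobian of F(x, s, z) = (dL/dx, x - s, z*s) for g(x) = x, columns (x, s, z)."""
    return [[hess_L, 0, -1],
            [1, -1, 0],
            [0, z, s]]

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# min x      s.t. x >= 0:  x* = 0, s* = 0, z* = 1, so (A5) holds.
print(det3(kkt_jacobian(hess_L=0, s=0, z=1)))

# min x^2/2  s.t. x >= 0:  x* = 0, s* = 0, z* = 0, so (A5) fails.
print(det3(kkt_jacobian(hess_L=1, s=0, z=0)))
```

In both problems (A3) and (A4) hold, so the singular Jacobian in the second case is caused only by the failure of strict complementarity.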

We lose a small amount of flexibility by adding slack variables to the KKT conditions
(25) and then working with the resulting system (26), instead of adding slack variables
directly to the optimization problem (24) and then working with the resulting KKT conditions.
This small observation is quite subtle, but will play a role in the formulation of our
interior-point method. Hence we now pursue it in some detail.

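Consider the following equivalent slack variable form of problem (24):

$$
\begin{aligned}
\min\ & f(x) && (28a) \\
\text{s.t.}\ & h(x) = 0 && (28b) \\
& g(x) - s = 0 && (28c) \\
& s \ge 0. && (28d)
\end{aligned}
$$

The KKT conditions for problem (28) are

$$
\begin{aligned}
\nabla f(x) + \nabla h(x) y - \nabla g(x) w &= 0 && (29a) \\
w - z &= 0 && (29b) \\
h(x) &= 0 && (29c) \\
g(x) - s &= 0 && (29d) \\
ZSe &= 0 && (29e) \\
(s, z) &\ge 0. && (29f)
\end{aligned}
$$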

The equation $w - z = 0$ in (29) says that at the solution the multipliers associated with the equality constraints $g(x) - s = 0$ are the same as the multipliers corresponding to the inequality constraints $s \ge 0$. Moreover, due to the linearity of this equation, the Newton corrections $\Delta w$ and $\Delta z$ will also be the same. However, the damped Newton step $w + \alpha_w \Delta w$ and the damped Newton step $z + \alpha_z \Delta z$ will be the same if and only if $\alpha_w = \alpha_z$ (assuming $\Delta w$ and $\Delta z$ are not both zero). We have learned from numerical experimentation that there is value in taking different steplengths for the $w$ and $z$ variables. Hence our interior-point method will be based on (29). In particular, we base our algorithm on the perturbed KKT conditions

$$
F_\mu(x, y, s, w, z) =
\begin{pmatrix}
\nabla f(x) + \nabla h(x) y - \nabla g(x) w \\
w - z \\
h(x) \\
g(x) - s \\
ZSe - \mu e
\end{pmatrix}
= 0, \qquad (s, w, z) > 0.
$$
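As a concrete illustration of the perturbed KKT residual, the following minimal Python sketch evaluates $F_\mu$ for a hypothetical one-variable problem, $\min x^2$ subject to $g(x) = x - 1 \ge 0$, with no equality constraints (so the $h$ and $y$ blocks are absent). The toy problem and the function name `perturbed_kkt_residual` are assumptions made for illustration only; they are not taken from the paper.

```python
def perturbed_kkt_residual(x, s, w, z, mu):
    """Residual F_mu for the toy problem: min x^2  s.t.  g(x) = x - 1 >= 0.

    The blocks follow the perturbed KKT conditions; the h(x) block is
    omitted because this illustrative problem has no equality constraints.
    """
    grad_f = 2.0 * x   # gradient of f(x) = x^2
    grad_g = 1.0       # gradient of g(x) = x - 1
    g = x - 1.0
    return [
        grad_f - grad_g * w,  # stationarity of the Lagrangian
        w - z,                # multipliers of g(x) - s = 0 and s >= 0 agree
        g - s,                # slack definition
        z * s - mu,           # perturbed complementarity
    ]


# The unperturbed KKT point (mu = 0) of the toy problem.
print(perturbed_kkt_residual(x=1.0, s=0.0, w=2.0, z=2.0, mu=0.0))
# A point satisfying the perturbed conditions for mu = 1.5.
print(perturbed_kkt_residual(x=1.5, s=0.5, w=3.0, z=3.0, mu=1.5))
```

Both calls return a zero residual: the first point solves the original KKT system, while the second keeps $(s, w, z)$ strictly positive and satisfies $zs = \mu$.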


Proposition 4.1 readily extends to $F'_\mu(x, y, s, w, z)$.

We now describe our primal-dual Newton interior-point method for the general nonlinear optimization problem (24). At the $k$th iteration, let $v_k = (x_k, y_k, s_k, w_k, z_k)$. We obtain our perturbed Newton correction $\Delta v_k = (\Delta x_k, \Delta y_k, \Delta s_k, \Delta w_k, \Delta z_k)$, corresponding to the parameter $\mu_k$, as the solution of the perturbed Newton linear system

$$F'_\mu(v_k)\, \Delta v = -F_\mu(v_k). \qquad (30)$$

We allow the flexibility of choosing different steplengths for the various components of $v_k$. If our choices of steplengths are $\alpha_x$, $\alpha_y$, $\alpha_s$, $\alpha_w$, and $\alpha_z$, we construct the expanded vector of steplengths

$$\alpha_k = (\alpha_x, \dots, \alpha_x,\ \alpha_y, \dots, \alpha_y,\ \alpha_s, \dots, \alpha_s,\ \alpha_w, \dots, \alpha_w,\ \alpha_z, \dots, \alpha_z),$$

where the frequencies of occurrence of the steplengths are $n$, $m$, $p$, $p$, and $p$, respectively. Now we let

$$\Lambda_k = \operatorname{diag}(\alpha_k), \qquad (31)$$

i.e. $\Lambda_k$ is a diagonal matrix with diagonal $\alpha_k$. Hence, the subsequent iterate $v_{k+1}$ can be written as

$$v_{k+1} = v_k + \Lambda_k \Delta v_k.$$

Now we are ready to state our generic primal-dual Newton interior-point method for the general nonlinear optimization problem (24). For global convergence considerations, a merit function $\phi(v)$ that measures the progress towards the solution $v^* = (x^*, y^*, s^*, w^*, z^*)$ should be used.

Algorithm 1 (Interior-Point Algorithm)

Step 0 Let $v_0 = (x_0, y_0, s_0, w_0, z_0)$ be an initial point satisfying $(s_0, w_0, z_0) > 0$.

For $k = 0, 1, 2, \dots$ do

Step 1 Test for convergence.

Step 2 Choose $\mu_k > 0$.

Step 3 Solve the linear system (30) for $\Delta v = (\Delta x, \Delta y, \Delta s, \Delta w, \Delta z)$.

Step 4 Compute the quantities

$$
\hat\alpha_s = \frac{-1}{\min((S_k)^{-1} \Delta s_k, -1)}, \qquad
\hat\alpha_w = \frac{-1}{\min((W_k)^{-1} \Delta w_k, -1)}, \qquad
\hat\alpha_z = \frac{-1}{\min((Z_k)^{-1} \Delta z_k, -1)}.
$$

Step 5 Choose $\tau_k \in (0, 1]$ and $\alpha_p \in (0, 1]$ satisfying

$$\phi(v_k + \Lambda_k \Delta v_k) \le \phi(v_k) + \beta \alpha_p \nabla\phi(v_k)^T \Delta v_k, \qquad (32)$$

for some fixed $\beta \in (0, 1)$, where $\Lambda_k$ is described in (31) with the steplength choices

$$
\begin{aligned}
\alpha_x &= \alpha_p \\
\alpha_y &= \alpha_p \\
\alpha_s &= \min(1, \tau_k \hat\alpha_s) \\
\alpha_w &= \min(1, \tau_k \hat\alpha_w) \\
\alpha_z &= \min(1, \tau_k \hat\alpha_z).
\end{aligned}
$$

Step 6 Set $v_{k+1} = v_k + \Lambda_k \Delta v_k$ and $k \leftarrow k + 1$. Go to Step 1.
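The steplength rules in Steps 4 and 5 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: vectors are plain lists, the helper names (`step_to_boundary`, `block_steplengths`, `damped_update`) are hypothetical, and the sufficient decrease test (32) is not checked here.

```python
def step_to_boundary(u, du):
    """Return -1 / min(U^{-1} du, -1) for a strictly positive vector u.

    The value is the step that takes u + alpha * du to the boundary of the
    positive orthant, capped so that it never exceeds 1.
    """
    ratios = [d / ui for ui, d in zip(u, du)]
    return -1.0 / min(min(ratios), -1.0)


def block_steplengths(s, ds, w, dw, z, dz, tau, alpha_p):
    """Steplengths for the x, y, s, w, z blocks as chosen in Step 5."""
    return {
        "x": alpha_p,
        "y": alpha_p,
        "s": min(1.0, tau * step_to_boundary(s, ds)),
        "w": min(1.0, tau * step_to_boundary(w, dw)),
        "z": min(1.0, tau * step_to_boundary(z, dz)),
    }


def damped_update(u, du, alpha):
    """One block of v + Lambda * dv: every entry of the block shares alpha."""
    return [ui + alpha * d for ui, d in zip(u, du)]


s, ds = [1.0, 2.0], [-4.0, 1.0]
w, dw = [1.0, 1.0], [0.5, 0.5]
z, dz = [2.0, 1.0], [-1.0, -3.0]
steps = block_steplengths(s, ds, w, dw, z, dz, tau=0.9, alpha_p=1.0)
s_new = damped_update(s, ds, steps["s"])
```

With $\tau_k < 1$ the updated $s$, $w$, and $z$ blocks remain strictly positive, which is the purpose of the factor $\tau_k$ (the percentage of movement to the boundary).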


If one prefers equal steplengths for the various component functions, then there is no value in carrying $w$ as a separate variable and it should be set equal to $z$. Moreover, in this case the obvious choice for the steplength for the $s$ and $z$ components is

$$\min(1, \tau_k \hat\alpha_s, \tau_k \hat\alpha_z). \qquad (33)$$

It is a straightforward matter to employ backtracking on (33) in order to satisfy the sufficient decrease condition (32). Our local analysis will be given with the steplength choice (33). A reasonable modification of this approach would be to choose $\alpha_k$ via backtracking and then choose $\alpha_p$, the steplength for the $(x, y)$-variables, such that $\alpha_p \ge \alpha_k$ and the sufficient decrease condition (32) is still maintained.

5  Local  Convergence  Properties 

In this section we will demonstrate that our perturbed and damped interior-point Newton's method can be implemented so that the highly desirable properties of the standard Newton's method are retained. We find this demonstration particularly satisfying since it adds credibility to our choice of formulation. The major issue here concerning fast convergence is the same as it was in the linear programming application. There it was dealt with successfully by Zhang, Tapia, and Dennis (Ref. 19), and Zhang and Tapia (Ref. 20). The issue is this: is it possible to choose the algorithmic parameters $\tau_k$ (percentage of movement to the boundary) and $\mu_k$ (perturbation) in such a way that the perturbed and damped step approaches the Newton step sufficiently fast so that quadratic convergence will be retained? We stress the point that the choices $\alpha_p = 1$ and $\tau_k = 1$ do not necessarily imply that the steplength $\alpha_k$ is 1.

We  begin  by  giving  a  formal  definition  of  the  perturbed  damped  Newton’s  method  and 
then  deriving  some  facts  that  will  be  useful  concerning  the  convergence  rate  of  the  perturbed 
damped  Newton’s  method.  Towards  this  end  consider  the  general  nonlinear  equation  problem 

$$F(x) = 0, \qquad (34)$$

where $F : \mathbb{R}^n \to \mathbb{R}^n$. Recall that the standard Newton's method assumptions for problem (34) are

(B1) There exists $x^* \in \mathbb{R}^n$ such that $F(x^*) = 0$.

(B2) The Jacobian matrix $F'(x^*)$ is nonsingular.

(B3) The Jacobian operator $F'$ is locally Lipschitz continuous at $x^*$.

By the perturbed damped Newton's method for problem (34) we mean the construction of the iteration sequence

$$x_{k+1} = x_k - \alpha_k F'(x_k)^{-1} [F(x_k) - \mu_k p], \qquad k = 0, 1, 2, \dots \qquad (35)$$

where $0 < \alpha_k \le 1$, $\mu_k > 0$, and $p$ is a fixed vector in $\mathbb{R}^n$.

Proposition 5.1 Consider a sequence $\{x_k\}$ generated by the perturbed damped Newton's method (35) for problem (34). Let $x_k \to x^*$ such that $F(x^*) = 0$ and the standard assumptions (B1)-(B3) hold at $x^*$.

(i) If $\alpha_k \to 1$ and $\mu_k = o(\|F(x_k)\|)$, then the sequence $\{x_k\}$ converges to $x^*$ Q-superlinearly.

(ii) If $\alpha_k = 1 + O(\|F(x_k)\|)$ and $\mu_k = O(\|F(x_k)\|^2)$, then the sequence $\{x_k\}$ converges to $x^*$ Q-quadratically.

Proof: Standard Newton's method analysis arguments (see Dennis and Schnabel (Ref. 21) for example) can be used to show that

$$\|x_{k+1} - x^*\| \le (1 - \alpha_k) \|x_k - x^*\| + \mu_k \|F'(x_k)^{-1} p\| + O(\|x_k - x^*\|^2), \qquad (36)$$

and

$$\|F(x_k)\| = O(\|x_k - x^*\|) \qquad (37)$$

for all $x_k$ sufficiently near $x^*$. The proof now follows by considering (36) and (37). □
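To make iteration (35) concrete, here is a minimal Python sketch for the scalar case $n = 1$. The test function $F(x) = x^2 - 2$, the choice $p = 1$, and the parameter rules $\alpha_k = 1 - \min(0.5, |F(x_k)|)$ and $\mu_k = \min(0.1, F(x_k)^2)$ are all assumptions chosen for illustration so that the hypotheses of part (ii) of Proposition 5.1 hold near the root; they are not the parameter choices of the paper.

```python
def perturbed_damped_newton(F, dF, x0, p=1.0, tol=1e-12, max_iter=100):
    """Scalar version of iteration (35): x+ = x - alpha * (F(x) - mu * p) / F'(x).

    Illustrative parameter rules (assumptions, not from the paper):
      alpha_k = 1 - min(0.5, |F(x_k)|), so alpha_k = 1 + O(|F(x_k)|),
      mu_k    = min(0.1, F(x_k)^2),     so mu_k    = O(|F(x_k)|^2).
    Returns the final iterate and the history of residuals |F(x_k)|.
    """
    x = x0
    history = []
    for _ in range(max_iter):
        fx = F(x)
        history.append(abs(fx))
        if abs(fx) <= tol:
            break
        alpha = 1.0 - min(0.5, abs(fx))
        mu = min(0.1, fx * fx)
        x = x - alpha * (fx - mu * p) / dF(x)
    return x, history


root, residuals = perturbed_damped_newton(
    F=lambda x: x * x - 2.0,
    dF=lambda x: 2.0 * x,
    x0=2.0,
)
```

Printing `residuals` shows slow initial progress while the step is heavily damped, followed by rapid decrease once $|F(x_k)|$ is small, which is the behavior the proposition describes.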

We are now ready to establish convergence rate results for our perturbed damped interior-point Newton's method for problem (24), i.e. Algorithm 1. First we introduce some notation and make several observations. We let $w = z$ and choose the steplength $\alpha_k$ given by (33). Our algorithm is the perturbed damped Newton's method applied to the nonlinear system $F(x, y, s, z) = 0$ given in (29). Observe that the conditions (A1)-(A5) imply the conditions (B1)-(B3) according to Proposition 4.1. In the following presentation it will be convenient to write

$$\mu_k = \sigma_k \min(S_k Z_k e)$$

and state our conditions in terms of $\sigma_k$.

Theorem 5.1 (Convergence Rate) Consider a sequence $\{v_k\}$ generated by Algorithm 1. Assume that $\{v_k\}$ converges to a solution $v^*$ such that the standard assumptions (A1)-(A5) for problem (24) hold at $v^*$.

(i) If $\tau_k \to 1$ and $\sigma_k \to 0$, then the sequence $\{v_k\}$ converges to $v^*$ Q-superlinearly.

(ii) If $\tau_k = 1 + O(\|F(v_k)\|)$ and $\sigma_k = O(\|F(v_k)\|)$, then the sequence $\{v_k\}$ converges to $v^*$ Q-quadratically.

Proof: The proof of the theorem will follow directly from Proposition 5.1 once we establish that $\alpha_k$ satisfies a relationship of the form

$$\alpha_k = \min(1, \tau_k + O(\sigma_k) + O(\|F(v_k)\|)). \qquad (38)$$

We now turn our attention to this task. Since $\Delta v_k = -F'(v_k)^{-1}(F(v_k) - \mu_k \hat{e})$, where the vector $\hat{e} = (0, \dots, 0, 1, \dots, 1)$ with $p$ ones, we see that

$$\|\Delta s_k\| = O(\|F(v_k)\|) + O(\mu_k), \qquad (39)$$

and

$$\|\Delta z_k\| = O(\|F(v_k)\|) + O(\mu_k). \qquad (40)$$

Hence both $\Delta s_k$ and $\Delta z_k$ converge to zero.

From linearized perturbed complementarity we have

$$S_k^{-1} \Delta s + Z_k^{-1} \Delta z = -e + \mu_k S_k^{-1} Z_k^{-1} e. \qquad (41)$$

It follows from strict complementarity, (39), (40), and (41) that if $i$ is an index such that $s_i^* = 0$, then

$$\frac{[\Delta s_k]_i}{[s_k]_i} = -1 + O(\|F(v_k)\|) + O(\sigma_k),$$

while if $i$ is an index such that $[s^*]_i > 0$, then

$$\frac{[\Delta s_k]_i}{[s_k]_i} = O(\|F(v_k)\|) + O(\mu_k).$$

Similar relationships hold for the $z$-variables. Hence

$$\min(S_k^{-1} \Delta s, Z_k^{-1} \Delta z) = -1 + O(\|F(v_k)\|) + O(\sigma_k).$$

So

$$\alpha_k = \min(1, \tau_k / (1 + O(\|F(v_k)\|) + O(\sigma_k))). \qquad (42)$$

However, if $\alpha_k$ satisfies a relationship of the form (42), then it satisfies a relationship of the form (38). □
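The last step of the proof, passing from (42) to (38), can be spelled out with a first-order expansion. The symbol $\epsilon_k$ below is introduced only for this remark and denotes the combined quantity $O(\|F(v_k)\|) + O(\sigma_k)$ appearing in the denominator of (42). For $|\epsilon_k|$ sufficiently small and $\tau_k \in (0, 1]$,

$$
\frac{\tau_k}{1 + \epsilon_k} = \tau_k \left(1 - \epsilon_k + O(\epsilon_k^2)\right) = \tau_k + O(\epsilon_k) = \tau_k + O(\sigma_k) + O(\|F(v_k)\|),
$$

where the boundedness of $\tau_k$ is used to absorb it into the order term. Taking the minimum with 1 on both sides gives (38).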


Theorem 5.2 (Local Convergence) Consider problem (24) and a solution $v^*$ such that the standard assumptions (A1)-(A5) hold at $v^*$. Given $\hat\tau \in (0, 1)$ there exists a neighborhood $D$ of $v^*$ and a constant $\hat\sigma > 0$ such that for any $v_0 \in D$ and any choice of algorithmic parameters $\tau_k \in [\hat\tau, 1]$ and $\sigma_k \in (0, \hat\sigma]$, Algorithm 1 is well defined and the iteration sequence converges Q-linearly to $v^*$.

Proof: We first observe that the estimates constructed in the proofs of Proposition 5.1 and Theorem 5.1 above do not depend on the fact that we assumed convergence of the iteration sequence. Clearly they depend strongly on the standard assumptions. By using (36), (37), and (38) we can derive

$$\|v_{k+1} - v^*\| \le (1 - \tau_k + O(\sigma_k) + O(\|v_k - v^*\|)) \|v_k - v^*\|. \qquad (43)$$

In (43) we used the fact that

$$\mu_k = \sigma_k O(\|F(v_k)\|) = \sigma_k O(\|v_k - v^*\|).$$

The proof now follows from (43). □


6  Global  Convergence  Theory 

In  this  section  we  establish  a  global  convergence  theory  for  a  primal-dual  Newton  interior- 
point  algorithm.  The  algorithm  that  we  consider  here  has  the  same  basic  structure  as 
Algorithm 1 with a particular choice for the merit function $\phi$. The main result is Theorem 6.1
which  states  that  any  limit  point  of  the  sequence  generated  by  our  algorithm  is  a  KKT  point 
of  problem  (24). 

We start by recalling that the slack-variable form of the KKT conditions of problem (24) is

$$
F(x, y, s, z) =
\begin{pmatrix}
\nabla_x L(x, y, s, z) \\
h(x) \\
g(x) - s \\
ZSe
\end{pmatrix}
= 0, \qquad (s, z) \ge 0,
$$

which can be written as

$$
F(x, y, s, z) \equiv
\begin{pmatrix}
G(x, y, s, z) \\
ZSe
\end{pmatrix}
= 0, \qquad (s, z) \ge 0,
$$

where

$$
G(x, y, s, z) =
\begin{pmatrix}
\nabla_x L(x, y, s, z) \\
h(x) \\
g(x) - s
\end{pmatrix}.
\qquad (44)
$$

As before we will use the notation

$$v = (x, y, s, z).$$

At a current point $v = (x, y, s, z)$ and for a chosen steplength $\alpha$, the subsequent iterate is calculated as

$$v(\alpha) = (x(\alpha), y(\alpha), s(\alpha), z(\alpha)) = (x, y, s, z) + \alpha (\Delta x, \Delta y, \Delta s, \Delta z),$$

where $(\Delta x, \Delta y, \Delta s, \Delta z)$ is the solution of the system

$$F'(v)\, \Delta v = -F(v) + \mu \hat{e}. \qquad (45)$$

To specify the selection of $\alpha$, we first introduce some quantities and functions that we will make use of later. For a given starting point $v_0 = (x_0, y_0, z_0, s_0)$ with $(s_0, z_0) > 0$, let

$$\tau_1 = \frac{\min(Z_0 S_0 e)}{(z_0)^T s_0 / p}, \qquad \tau_2 = \frac{(z_0)^T s_0}{\|G(v_0)\|_2}.$$

Define

$$f^I(\alpha) = \min(Z(\alpha) s(\alpha)) - \gamma \tau_1 z(\alpha)^T s(\alpha) / p, \qquad (46)$$

and

$$f^{II}(\alpha) = z(\alpha)^T s(\alpha) - \gamma \tau_2 \|G(v(\alpha))\|_2, \qquad (47)$$

where $\gamma \in (0, 1)$ is a constant. We note that the functions $f^i(\alpha)$, $i = I, II$, depend on the iteration count $k$, though for simplicity we choose not to write out this dependency explicitly. It is also worth noting that

(i) for $v = v_0$ and $\gamma = 1$, $f^i(0) = 0$ for $i = I, II$;

(ii) $f^I(\alpha)$ is a piecewise quadratic and $f^{II}(\alpha)$ is generally nonlinear.

It is known that if $\alpha_k$ are chosen such that $f^I(\alpha) \ge 0$ for all $\alpha \in [0, \alpha_k]$ at every iteration, then $(z_k, s_k) > 0$ and

$$\frac{\min(Z_k S_k e)}{(z_k)^T s_k / p} \ge \gamma_k \tau_1,$$

where $\gamma_k \in (0, 1)$. This is a familiar centrality condition for interior-point methods.

Based on these observations, in choosing the steplength $\alpha_k$ at every iteration, we will require $\alpha_k$ to satisfy $f^i(\alpha_k) \ge 0$, $i = I, II$, and $f^I(\alpha) \ge 0$ for all $\alpha \in [0, \alpha_k]$.

For $i = I, II$, define

$$\alpha^i = \max_{\alpha \in [0, 1]} \{\alpha : f^i(\alpha') \ge 0 \text{ for all } \alpha' \le \alpha\}, \qquad (48)$$

i.e., $\alpha^i$ are either one or the smallest positive root for the functions $f^i(\alpha)$ in $(0, 1]$ (it will be shown later that $\alpha^i > 0$). Since $f^I(\alpha)$ is a piecewise quadratic, $\alpha^I$ is easy to find.

Our globalized algorithm is a perturbed and damped Newton method with a backtracking linesearch. The merit function used for the linesearch is the squared $\ell_2$ norm of the residual, i.e.,

$$\phi(v) = \|F(v)\|_2^2.$$

We use the notation $\phi_k$ to denote the value of the function $\phi(v)$ evaluated at $v_k$. Similar notation will be used for other quantities depending on $v_k$. Moreover, we use $\phi_k(\alpha)$ to denote $\phi(v_k + \alpha \Delta v_k)$. Clearly, $\phi_k = \phi_k(0) = \phi(v_k)$.

It is not difficult to obtain a condition under which the perturbed Newton step

$$\Delta v_k = -F'(v_k)^{-1} F_\mu(v_k), \qquad F_\mu(v) = F(v) - \mu \hat{e},$$

gives descent for the merit function $\phi(v)$. The derivative of $\phi_k(\alpha)$ at $\alpha = 0$ is

$$
\begin{aligned}
(\nabla\phi)^T \Delta v &= 2 (F'(v)^T F(v))^T [F'(v)^{-1}(-F(v) + \mu \hat{e})] \\
&= 2 F(v)^T (-F(v) + \mu \hat{e}) \\
&= 2 (-\|F(v)\|_2^2 + \mu F(v)^T \hat{e}),
\end{aligned}
$$

hence

$$(\nabla\phi)^T \Delta v < 0 \quad \text{if and only if} \quad \mu < \|F(v)\|_2^2 / s^T z.$$
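Because each component of $z(\alpha) \circ s(\alpha) - \gamma \tau_1 z(\alpha)^T s(\alpha) / p$ is a quadratic in $\alpha$, the steplength $\alpha^I$ of (48) can be found by solving $p$ scalar quadratics. The following Python sketch illustrates this under stated assumptions: vectors are plain lists, the function names are hypothetical, and the current point is assumed to satisfy $f^I(0) > 0$ strictly (otherwise a `ValueError` is raised). It stops at the first zero of any component, which is the "smallest positive root" described in the text.

```python
import math


def smallest_positive_root(a2, a1, a0):
    """Smallest root in (0, inf) of a2*t^2 + a1*t + a0, or math.inf if none."""
    if a2 == 0.0:
        if a1 == 0.0:
            return math.inf
        t = -a0 / a1
        return t if t > 0.0 else math.inf
    disc = a1 * a1 - 4.0 * a2 * a0
    if disc < 0.0:
        return math.inf
    r = math.sqrt(disc)
    roots = [(-a1 - r) / (2.0 * a2), (-a1 + r) / (2.0 * a2)]
    positive = [t for t in roots if t > 0.0]
    return min(positive) if positive else math.inf


def alpha_I(z, s, dz, ds, gamma, tau1):
    """First alpha in (0, 1] at which some component of f^I reaches zero, else 1."""
    c = gamma * tau1 / len(z)
    zs = sum(zi * si for zi, si in zip(z, s))
    cross = sum(zi * dsi + si * dzi for zi, si, dzi, dsi in zip(z, s, dz, ds))
    dzds = sum(dzi * dsi for dzi, dsi in zip(dz, ds))
    alpha = 1.0
    for zi, si, dzi, dsi in zip(z, s, dz, ds):
        a0 = zi * si - c * zs                # value at alpha = 0
        a1 = zi * dsi + si * dzi - c * cross  # linear coefficient
        a2 = dzi * dsi - c * dzds             # quadratic coefficient
        if a0 <= 0.0:
            raise ValueError("sketch assumes f^I(0) > 0 at the current point")
        alpha = min(alpha, smallest_positive_root(a2, a1, a0))
    return alpha


# Two complementarity pairs; only z_1 moves, toward the boundary.
a = alpha_I(z=[1.0, 1.0], s=[1.0, 1.0], dz=[-2.0, 0.0], ds=[0.0, 0.0],
            gamma=0.5, tau1=1.0)
```

In this example the first component equals $0.5 - 1.5\alpha$, so the centrality condition first becomes tight at $\alpha = 1/3$.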

Now we describe the globalized primal-dual Newton interior-point algorithm.

Algorithm 2 (Global Algorithm)

Step 0 Choose $v_0 = (x_0, y_0, s_0, z_0)$ such that $(s_0, z_0) > 0$, $\rho \in (0, 1)$ and $\beta \in (0, 1/2]$. Set $k = 0$, $\gamma_{k-1} = 1$, and compute $\phi_0 = \phi(v_0)$. For $k = 0, 1, 2, \dots$, do

Step 1 Test for convergence: if $\phi_k \le \epsilon_{exit}$, stop.

Step 2 Choose $\sigma_k \in (0, 1)$ and for $v = v_k$ compute the perturbed Newton direction $\Delta v_k$ from (45) with

$$\mu_k = \sigma_k \frac{(s_k)^T z_k}{p}.$$

Step 3 Steplength selection:

(3a) Choose $1/2 \le \gamma_k \le \gamma_{k-1}$, compute $\alpha^i$, $i = I, II$, from (48) and let

$$\bar\alpha_k = \min(\alpha^I, \alpha^{II}). \qquad (49)$$

(3b) Let $\alpha_k = \rho^t \bar\alpha_k$, where $t$ is the smallest nonnegative integer such that $\alpha_k$ satisfies

$$\phi_k(\alpha_k) \le \phi_k(0) + \alpha_k \beta \phi_k'(0). \qquad (50)$$

Step 4 Let $v_{k+1} = v_k + \alpha_k \Delta v_k$ and $k \leftarrow k + 1$. Go to Step 1.
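Step (3b) is a standard backtracking (Armijo) search started from the steplength produced in Step (3a). The Python sketch below illustrates it under stated assumptions: the merit function `phi` is any callable on a list, `slope` stands for $\phi_k'(0)$, and the function name, the default parameter values, and the cap on the number of reductions are hypothetical choices made for this illustration.

```python
def backtrack(phi, v, dv, slope, alpha_bar, beta=1e-4, rho=0.5, max_reductions=60):
    """Return alpha = rho**t * alpha_bar for the smallest t >= 0 such that
    phi(v + alpha * dv) <= phi(v) + alpha * beta * slope.

    slope is the directional derivative of phi at v along dv; it should be
    negative, i.e. dv should be a descent direction for phi.
    """
    phi0 = phi(v)
    alpha = alpha_bar
    for _ in range(max_reductions):
        trial = [vi + alpha * di for vi, di in zip(v, dv)]
        if phi(trial) <= phi0 + alpha * beta * slope:
            return alpha
        alpha *= rho
    raise RuntimeError("no acceptable steplength found; is dv a descent direction?")


# Toy merit function phi(v) = v_1^2 with a deliberately overlong step.
phi = lambda v: v[0] * v[0]
alpha = backtrack(phi, v=[1.0], dv=[-4.0], slope=-8.0, alpha_bar=1.0)
```

Here the trial steplengths 1 and 0.5 are rejected and 0.25 is accepted, since it is the first one that produces the required decrease.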

The question as to whether the perturbed Newton direction is a descent direction for the merit function $\phi$ (for the choice of $\mu_k$ given in Algorithm 2) is answered in the affirmative in the following proposition.

Proposition 6.1 The direction $\Delta v_k$ generated by Algorithm 2 is a descent direction for the merit function $\phi(v)$ at $v_k$. Moreover, if condition (50) is satisfied, then

$$\phi_k(\alpha_k) \le [1 - 2 \alpha_k \beta (1 - \sigma_k)]\, \phi_k(0).$$

Proof: We will suppress the subscript $k$ in the proof. Note that

$$\nabla\phi^T \Delta v = -2 (\phi - \mu z^T s).$$

Since $\mu = \sigma z^T s / p$ and

$$(z^T s)^2 / p = (\|ZSe\|_1 / \sqrt{p})^2 \le \|ZSe\|_2^2 \le \|G\|_2^2 + \|ZSe\|_2^2 = \phi, \qquad (51)$$

it follows that

$$\nabla\phi^T \Delta v \le -2 (1 - \sigma) \phi < 0.$$

So the perturbed Newton direction indeed gives descent. Moreover, condition (50) can be written

$$\phi(\alpha) \le [1 - 2 \alpha \beta (1 - \sigma)]\, \phi(0).$$

This proves the proposition. □

This proposition asserts also that the sequence $\{\phi_k\}$ is monotone and non-increasing, therefore,

$$\phi_k \le \phi_0 \quad \text{for all } k.$$

Moreover, we have global Q-linear convergence of the values of the merit function $\phi$ to zero if $\{\alpha_k\}$ is bounded away from zero and $\sigma_k$ is bounded away from one. It is also worth noting that the above inequality is equivalent to

$$\frac{\|F(v_{k+1})\|_2}{\|F(v_k)\|_2} \le [1 - 2 \alpha_k \beta (1 - \sigma_k)]^{1/2}.$$

One problem that may preclude global convergence is that the sequence $\{\|Z_k S_k e\|\}$ converges to zero, but $\{\phi(v_k)\}$ does not. The following proposition shows that Step (3a) in Algorithm 2 plays a key role in preventing such behavior from occurring.

Proposition 6.2 Let $\{v_k\}$ be generated by Algorithm 2. Then

$$\ell\, \phi(v_k) \le [(z_k)^T s_k]^2 \le p\, \phi(v_k),$$

where $\ell = [\min(1, 0.5 \tau_2) / 2]^2$.

Proof: We will again suppress the subscript $k$ in the proof. The second inequality follows from (51). So we only need to prove the first one.

Since $\alpha_k \le \bar\alpha_k$, we have $f^i(\alpha_k) \ge 0$, $i = I, II$. From (47) and the choice $\gamma_k \ge 1/2$,

$$z^T s \ge \tfrac{1}{2} (\|ZSe\|_2 + 0.5 \tau_2 \|G\|_2) \ge \tfrac{1}{2} \min(1, 0.5 \tau_2) \|F\|_2.$$

This completes the proof. □

Given $\epsilon \ge 0$, let us define the set

$$
\Omega(\epsilon) = \left\{ v : \ \epsilon \le \phi(v) \le \phi_0, \quad \frac{\min(ZSe)}{z^T s / p} \ge \frac{\tau_1}{2}, \quad \frac{z^T s}{\|G(v)\|_2} \ge \frac{\tau_2}{2} \right\}.
\qquad (52)
$$


This set will play a pivotal role in establishing our global convergence theory. For this set, the following observations are in order.

(a) $\Omega(\epsilon)$ is a closed set.

(b) From the construction of the algorithm, in particular, $\gamma_k \ge 1/2$, we have $\{v_k\} \subset \Omega(0)$.

(c) In $\Omega(\epsilon)$ where $\epsilon > 0$, $z^T s$ is bounded above and bounded away from zero.

(d) In $\Omega(\epsilon)$ where $\epsilon > 0$, all components of $ZSe$ are bounded above and bounded away from zero.

We will establish global convergence of the algorithm under the following assumptions.

(C1) In the set $\Omega(0)$, the functions $f(x)$, $h(x)$, and $g(x)$ are twice continuously differentiable and the derivative of $G(v)$, given by equation (44), is Lipschitz continuous with constant $L$. Moreover, the columns of $\nabla h(x)$ are linearly independent.

(C2) The iteration sequence $\{x_k\}$ is bounded. (This can be ensured by enforcing box constraints $-M \le x \le M$ for sufficiently large $M > 0$.)

(C3) The matrix $\nabla_x^2 L(x, y, z) + \nabla g(x) S^{-1} Z \nabla g(x)^T$ is invertible for $v$ in any compact subset of $\Omega(0)$ where $s$ is bounded away from zero.

(C4) Let $I_s^0$ be the index set $\{i : 1 \le i \le p,\ \liminf [s_k]_i = 0\}$. Then the set of gradients $\{\nabla h_1(x_k), \dots, \nabla h_m(x_k), \nabla g_i(x_k),\ i \in I_s^0\}$ is linearly independent for $k$ sufficiently large.

We note that if we have $g(x_k) - s_k \to 0$ in the algorithm, then Assumption (C4) is equivalent to the linear independence of the gradients of the active constraints, which is a standard regularity assumption in constrained optimization.

Proposition 6.3 Assume that Assumption (C1) holds. Then for $v \in \Omega(\epsilon)$ and $x$ in a compact set, there exists a positive constant $M_1$ such that

$$\|y\| \le M_1 (1 + \|z\|).$$

Proof: We have $\nabla_x L(x, y, s, z) = \nabla f(x) + \nabla h(x) y - \nabla g(x) z$. Then by Assumption (C1),

$$y = [\nabla h(x)^T \nabla h(x)]^{-1} \nabla h(x)^T (\nabla_x L(x, y, s, z) - \nabla f(x) + \nabla g(x) z).$$

This implies the proposition. □

In the remaining part of this section, we concentrate our effort on proving the following fact: given any $\epsilon > 0$, as long as the iteration sequence $\{v_k\}$ generated by the algorithm satisfies

$$v_k \in \Omega(\epsilon), \qquad \epsilon > 0,$$

the step sequence $\{\Delta v_k\}$ and the steplength sequence $\{\alpha_k\}$ are uniformly bounded above and away from zero, respectively, in the algorithm. This fact implies the convergence of the algorithm.

Lemma 6.1 If $\{v_k\} \subset \Omega(\epsilon)$, then the iteration sequence $\{v_k\}$ is bounded above and in addition $\{(z_k, s_k)\}$ is component-wise bounded away from zero.

Proof: From Assumption (C2), $\{x_k\}$ is bounded. By Proposition 6.3, it suffices to prove that $\{(z_k, s_k)\}$ is bounded above and component-wise away from zero.

The boundedness of $\{x_k\}$ in $\Omega(\epsilon)$ implies that $\{\|g(x_k)\|\}$ is bounded above, say, by $M_2 > 0$. Therefore, it follows from the definition (52) and the fact that $\{\|F(v_k)\|_2\}$ is monotonically decreasing that

$$\|s_k\| \le \|g(x_k) - s_k\| + \|g(x_k)\| \le \sqrt{\phi_0} + M_2.$$

This proves that $\{s_k\}$ is bounded above.

In $\Omega(\epsilon)$, the sequences $\{[z_k]_i [s_k]_i\}$, $i = 1, 2, \dots, p$, are all bounded away from zero. Hence all components of $\{z_k\}$ are bounded away from zero because $\{s_k\}$ is bounded above. Moreover, $\{s_k\}$ will be bounded away from zero if $\{z_k\}$ is bounded above. This will be proved next by contradiction.

Suppose that, if necessary considering a subsequence, $[z_k]_i \to \infty$ for $i$ in some index set. Then the boundedness of $\{[z_k]_i [s_k]_i\}$ implies that $\liminf [s_k]_i = 0$ and the corresponding index set is $I_s^0$. Since $\|\nabla f(x_k) + \nabla h(x_k) y_k - \nabla g(x_k) z_k\|$ is bounded in $\Omega(\epsilon)$, so is $\|\nabla h(x_k) y_k - \nabla g(x_k) z_k\|$ because $\|\nabla f(x_k)\|$ is bounded. Since $\|z_k\| \to \infty$,

$$\frac{\|\nabla h(x_k) y_k - \nabla g(x_k) z_k\|}{\|(y_k, z_k)\|} \to 0.$$

Let $w^*$ be any limit point of $\{(y_k, z_k) / \|(y_k, z_k)\|\}$. Clearly, $\|w^*\| = 1$, and the components of $w^*$ corresponding to those indices for which $\{[z_k]_i\} < +\infty$, i.e., $i \notin I_s^0$, are zero. Let $\hat{w}^*$ be the vector consisting of the components of $w^*$ but excluding those corresponding to $i \notin I_s^0$. So $\|\hat{w}^*\| = \|w^*\| = 1$. The above relation implies that at least for a subsequence of $\{x_k\}$,

$$[\nabla h(x_k), \nabla g_i(x_k),\ i \in I_s^0]\, \hat{w}^* \to 0.$$

This, however, contradicts Assumption (C4). So $\{z_k\}$ is bounded above and $\{s_k\}$ is bounded away from zero. □

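Lemma 6.2 If $\{v_k\} \subset \Omega(\epsilon)$, then $\{[F'(v_k)]^{-1}\}$ is bounded.

Proof: For simplicity, we will suppress the arguments and subscripts in this proof. Rearranging the order of rows and columns of $F'(v)$, we have the following matrix:

$$
\begin{pmatrix}
Z & S & 0 & 0 \\
-I & 0 & 0 & \nabla g^T \\
0 & 0 & 0 & \nabla h^T \\
0 & -\nabla g & \nabla h & \nabla_x^2 L
\end{pmatrix}
=
\begin{pmatrix}
A & B \\
-B^T & C
\end{pmatrix},
$$

where

$$
A = \begin{pmatrix} Z & S \\ -I & 0 \end{pmatrix}, \qquad
B = \begin{pmatrix} 0 & 0 \\ 0 & \nabla g^T \end{pmatrix}, \qquad
C = \begin{pmatrix} 0 & \nabla h^T \\ \nabla h & \nabla_x^2 L \end{pmatrix}.
$$

From Lemma 6.1,

$$
A^{-1} = \begin{pmatrix} 0 & -I \\ S^{-1} & S^{-1} Z \end{pmatrix}
$$

exists in $\Omega(\epsilon)$ and is uniformly bounded. Furthermore, by Assumptions (C1), (C3), and Lemma 6.1 the matrix

$$
H = (B^T A^{-1} B + C) =
\begin{pmatrix}
0 & \nabla h^T \\
\nabla h & \nabla_x^2 L + \nabla g S^{-1} Z \nabla g^T
\end{pmatrix}
$$

is invertible and $\|H^{-1}\|$ is uniformly bounded in $\Omega(\epsilon)$. A straightforward calculation shows that

$$
\begin{pmatrix} A & B \\ -B^T & C \end{pmatrix}^{-1}
=
\begin{pmatrix}
A^{-1} - A^{-1} B H^{-1} B^T A^{-1} & -A^{-1} B H^{-1} \\
H^{-1} B^T A^{-1} & H^{-1}
\end{pmatrix},
$$

which is bounded since every matrix involved is bounded. This implies that $(F'(v))^{-1}$ is uniformly bounded in $\Omega(\epsilon)$ and proves the lemma. □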

The following corollary follows directly from Lemma 6.2.

Corollary 6.1 If $\{v_k\} \subset \Omega(\epsilon)$, then the sequence of search steps $\{\Delta v_k\}$ generated by Algorithm 2 is bounded.

Now we prove that $\{\bar\alpha_k\}$ given by Step (3a) of Algorithm 2 is bounded away from zero.

Lemma 6.3 If $\{v_k\} \subset \Omega(\epsilon)$ and $\{\sigma_k\}$ is bounded away from zero, then $\{\bar\alpha_k\}$ is bounded away from zero.


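Proof: Let us suppress the subscript $k$. Since $\bar\alpha = \min(\alpha^I, \alpha^{II})$, where

$$\alpha^i = \max_{\alpha \in [0, 1]} \{\alpha : f^i(\alpha') \ge 0 \text{ for all } \alpha' \le \alpha\}, \qquad i = I, II,$$

it suffices to show that $\{\alpha^i\}$, $i = I, II$, are bounded away from zero.

From the definition of $\alpha^I$ and $f^I(\alpha)$, $\alpha^I$ is the largest number in $[0, 1]$ such that

$$z_i(\alpha) s_i(\alpha) - \gamma \tau_1 z(\alpha)^T s(\alpha) / p \ge 0, \qquad \alpha \in [0, \alpha^I], \quad i = 1, 2, \dots, p.$$

Let

$$\eta_i = |\Delta z_i \Delta s_i - \gamma \tau_1 \Delta z^T \Delta s / p|.$$

From the boundedness of $\Delta v$ (see Corollary 6.1), we have for some positive constant $M_3$,

$$\eta_i \le M_3.$$

A straightforward calculation shows that for $\alpha \in [0, 1]$

$$
\begin{aligned}
z_i(\alpha) s_i(\alpha) - \gamma \tau_1 z(\alpha)^T s(\alpha) / p
&= (1 - \alpha)(z_i s_i - \gamma \tau_1 z^T s / p) + (1 - \gamma \tau_1) \mu \alpha + (\Delta z_i \Delta s_i - \gamma \tau_1 \Delta z^T \Delta s / p) \alpha^2 \\
&\ge (1 - \gamma \tau_1) \mu \alpha - |\Delta z_i \Delta s_i - \gamma \tau_1 \Delta z^T \Delta s / p|\, \alpha^2 \\
&= (1 - \gamma \tau_1) \mu \alpha - \eta_i \alpha^2 \\
&\ge (1 - \gamma \tau_1) \mu \alpha - M_3 \alpha^2.
\end{aligned}
$$

From the definition of $\alpha^I$ (see (48)), clearly,

$$\alpha^I \ge \frac{(1 - \gamma \tau_1) \mu}{M_3}.$$

Observe that $\mu = \sigma s^T z / p$ is bounded below in $\Omega(\epsilon)$ for $\sigma$ bounded away from zero. Hence $\alpha^I$ is bounded away from zero in $\Omega(\epsilon)$.

Now we show that $\{\alpha^{II}\}$ generated by Step (3a) of Algorithm 2 is bounded away from zero. By the mean-value theorem for vector-valued functions,

$$
\begin{aligned}
G(v + \alpha \Delta v) &= G(v) + \alpha \left( \int_0^1 G'(v + t \alpha \Delta v)\, dt \right) \Delta v \\
&= G(v) + \alpha G'(v) \Delta v + \alpha \left( \int_0^1 (G'(v + t \alpha \Delta v) - G'(v))\, dt \right) \Delta v \\
&= G(v)(1 - \alpha) + \alpha \left( \int_0^1 (G'(v + t \alpha \Delta v) - G'(v))\, dt \right) \Delta v.
\end{aligned}
$$

Invoking Lipschitz continuity for the derivative of $G(v)$ (Assumption (C1)), we obtain

$$\|G(v + \alpha \Delta v)\| \le \|G(v)\| (1 - \alpha) + L \|\Delta v\|^2 \alpha^2.$$

Using the above inequality, we have

$$
\begin{aligned}
f^{II}(\alpha) &= z(\alpha)^T s(\alpha) - \gamma \tau_2 \|G(v + \alpha \Delta v)\| \\
&\ge z^T s (1 - \alpha) + z^T s\, \sigma \alpha + (\Delta z)^T \Delta s\, \alpha^2 - \gamma \tau_2 (\|G(v)\| (1 - \alpha) + L \|\Delta v\|^2 \alpha^2) \\
&= (z^T s - \gamma \tau_2 \|G(v)\|)(1 - \alpha) + z^T s\, \sigma \alpha + [(\Delta z)^T \Delta s - \gamma \tau_2 L \|\Delta v\|^2] \alpha^2 \\
&\ge \alpha [z^T s\, \sigma - |(\Delta z)^T \Delta s - \gamma \tau_2 L \|\Delta v\|^2|\, \alpha].
\end{aligned}
$$

Since $\{\Delta v_k\}$ is uniformly bounded, there exists a constant $M_4 > 0$ such that

$$|(\Delta z)^T \Delta s - \gamma \tau_2 L \|\Delta v\|^2| \le M_4.$$

Hence

$$f^{II}(\alpha) \ge \alpha (z^T s\, \sigma - M_4 \alpha).$$

This implies that

$$\alpha^{II} \ge \frac{z^T s\, \sigma}{M_4}.$$

Since $\{s_k^T z_k\}$ and $\{\sigma_k\}$ are bounded away from zero in $\Omega(\epsilon)$, $\{\alpha_k^{II}\}$ is bounded away from zero. This completes the proof. □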


Theorem 6.1 Let $\{v_k\}$ be generated by Algorithm 2 with $\epsilon_{exit} = 0$, and $\{\sigma_k\} \subset (0, 1)$ bounded away from zero and one. Under Assumptions (C1)-(C4), $\{F(v_k)\}$ converges to zero and for any limit point $v^* = (x^*, y^*, z^*, s^*)$ of $\{v_k\}$, $x^*$ is a KKT point of problem (24).

Proof: Note that $\{\|F(v_k)\|\}$ is monotone decreasing, hence convergent. By contradiction, suppose that $\{\|F(v_k)\|\}$ does not converge to zero. Then $\{v_k\} \subset \Omega(\epsilon)$ for some $\epsilon > 0$. If for infinitely many iterations $\alpha_k = \bar\alpha_k$, then it follows from the inequality of Proposition 6.1,

$$\phi_k(\alpha_k) \le [1 - 2 \alpha_k \beta (1 - \sigma_k)]\, \phi_k(0),$$

and Lemma 6.3 that the corresponding subsequence of $\{\phi_k\}$ converges to zero Q-linearly. This gives a contradiction. Now assume that $\alpha_k < \bar\alpha_k$ for $k$ sufficiently large. Since $\{\bar\alpha_k\}$ is bounded away from zero, the backtracking linesearch used in Algorithm 2 produces

$$\frac{\nabla\phi(v_k)^T \Delta v_k}{\|\Delta v_k\|} = \frac{-2 (\phi(v_k) - \mu_k (z_k)^T s_k)}{\|\Delta v_k\|} \to 0;$$

see Ortega and Rheinboldt (Ref. 22) and Byrd and Nocedal (Ref. 23). Since $\{\Delta v_k\}$ is bounded according to Corollary 6.1,

$$\phi(v_k) - \mu_k (z_k)^T s_k \to 0.$$

However, it follows from (51) that

$$\phi(v_k) - \mu_k (z_k)^T s_k \ge (1 - \sigma_k)\, \phi(v_k).$$

Therefore, it must hold that $\phi(v_k) \to 0$ because $\{\sigma_k\}$ is bounded away from one. This again leads to a contradiction. So $\{\|F(v_k)\|\}$ must converge to zero.

Since the KKT conditions for problem (24), $F(x, y, z, s) = 0$ and $(z, s) \ge 0$, are satisfied by $v^*$, clearly $x^*$ is a KKT point. □

7 Computational Experience


In  this  section  we  report  our  preliminary  numerical  experience  with  Algorithm  2.  The 
numerical  experiments  were  done  on  a  Sun  4/490  Workstation  running  SunOS  Operating 
System  Release  4.1.3  with  64  Megabytes  of  memory.  The  programs  were  written  in  MATLAB 
and  run  under  version  4.1. 

We implemented Algorithm 2 with a slight simplification, i.e., we did not enforce condition (47) in our linesearch in order to avoid possible complications caused by the nonlinear function $f^{II}(\alpha)$ in condition (47).

We chose the algorithmic parameters for Algorithm 2 as follows. In Step 2, we chose $\sigma_k = \min(\eta_1, \eta_2 s_k^T z_k)$, where $\eta_1 = 0.2$ and $\eta_2 = 100$. Moreover, we used $\beta = 10^{-4}$ in condition (50) of Step (3b), and set the backtracking factor $\rho$ to 0.5.

In our implementation, we used a finite-difference approximation to the Hessian of the Lagrangian function. The numerical experiments were performed on a subset of the Hock and Schittkowski test problems (Ref. 24 and 25). For most problems, we used the standard starting points listed in (Ref. 24 and 25). However, for some problems, the standard starting points are too close to the solution and we instead selected more challenging starting points.

The results of our numerical experience are summarized in Table 1. The first and the sixth columns give the problem number as given in (Ref. 24 and 25). The $n$, $m$, and $p$ columns give the dimension (number of variables, not including slack variables), the number of equality constraints and the number of inequality constraints, respectively. The Iterations column gives the number of iterations required by Algorithm 2 to obtain a point that satisfies the stopping criterion

$$\frac{\|F(v_k)\|_2}{1 + \|v_k\|_2} \le \epsilon_{exit} = 10^{-8}.$$
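The parameter rule for $\sigma_k$ and the relative stopping test quoted above are simple to express in code. The Python sketch below is an illustration only, not the authors' MATLAB implementation: the function names are hypothetical and vectors are plain lists.

```python
import math


def centering_parameter(s, z, eta1=0.2, eta2=100.0):
    """sigma_k = min(eta1, eta2 * s^T z), with the reported eta1 and eta2."""
    return min(eta1, eta2 * sum(si * zi for si, zi in zip(s, z)))


def two_norm(u):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(ui * ui for ui in u))


def has_converged(F_v, v, eps_exit=1e-8):
    """Relative stopping test ||F(v)||_2 / (1 + ||v||_2) <= eps_exit."""
    return two_norm(F_v) / (1.0 + two_norm(v)) <= eps_exit


# Far from complementarity the cap eta1 is active ...
sigma_far = centering_parameter([1.0, 2.0], [3.0, 0.5])
# ... while close to it sigma_k shrinks in proportion to s^T z.
sigma_near = centering_parameter([1e-3], [1e-3])
```

With these values `sigma_far` equals 0.2 and `sigma_near` is approximately $10^{-4}$, so the perturbation is reduced automatically as the complementarity gap $s_k^T z_k$ closes.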

We summarize the results of our numerical experimentation in the following comments.

(i) The implemented algorithm solved all the problems tested to the given tolerance, except for problems 13 and 23. For problem 23 we had to take different step sizes with respect to the $s$-variables and $z$-variables in order to converge. For problem 13, where regularity does not hold, we only obtained a small decrease in the merit function. After 100 iterations the norm of the residual was $3.21 \times 10^{-2}$ and $\|g(x) - s\|_2$ was of order $10^{-8}$.

(ii) The quadratic rate of convergence is observed in problems where second-order sufficiency is satisfied.

(iii) In the absence of strict complementarity, the algorithm was globally convergent but the local convergence was slow. This observation is compatible with our convergence theory. Strict complementarity is needed only for fast local convergence.

8  Concluding  Remarks 

Some understanding of the relationship between the logarithmic barrier function formulation
and the perturbed Karush-Kuhn-Tucker conditions was presented in Sections 2-3. In
summary, the logarithmic barrier function method has an inherent flaw of ill-conditioning.
This  conditioning  deficiency  can  be  circumvented  by  introducing  an  auxiliary  variable  and 
writing  the  defining  relationship  for  this  auxiliary  variable  in  a  particularly  nice  manner 
which  can  be  viewed  as  perturbed  complementarity.  The  resulting  system  is  the  perturbed 
KKT  conditions.  This  approach  of  deriving  the  perturbed  KKT  conditions  from  the  KKT 
conditions of the logarithmic barrier function problem involves auxiliary variables and a nonlinear
transformation and is akin to Hestenes’ derivation of the multiplier method from the
penalty  function  method.  Hence  attributing  algorithmic  strengths  resulting  from  the  use  of 
the  perturbed  KKT  conditions  to  the  KKT  conditions  for  the  logarithmic  barrier  function 
is  inappropriate  and  analogous  to  crediting  the  penalty  function  method  for  the  algorithmic 
strengths  of  the  multiplier  method.  In  Section  4  we  presented  a  formulation  of  a  generic 
line-search primal-dual interior-point method for the general nonlinear programming problem.
The viability of the formulation was demonstrated in Sections 5 and 6. In Section 5, we
established  the  standard  Newton’s  method  local  convergence  and  convergence  rate  results 
for  our  interior-point  formulation.  In  Section  6,  we  devised  a  globalization  strategy  using  the 
$\ell_2$-norm-residual merit function and established a global convergence theory for this strategy.
Finally,  our  preliminary  numerical  results  obtained  from  the  globalized  algorithm  appear  to 
be  promising. 
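The relationship summarized above can be illustrated on the linear programming case. The following is a sketch in notation assumed here (a standard-form linear program with data $A$, $b$, $c$, diagonal matrices $X = \mathrm{diag}(x)$ and $Z = \mathrm{diag}(z)$, and $e$ the vector of ones), which may differ in detail from the notation of Sections 2-3.

```latex
% Logarithmic barrier problem for  min c^T x  s.t.  Ax = b,  x >= 0:
%   minimize  c^T x - \mu \sum_i \log x_i   subject to  Ax = b.
% Its KKT conditions, with multiplier y for Ax = b:
\begin{align*}
  c - A^{T} y - \mu X^{-1} e &= 0, \\
  A x - b &= 0.
\end{align*}
% Introduce the auxiliary variable z = \mu X^{-1} e and write this defining
% relation in the equivalent (for x > 0) form XZe = \mu e:
\begin{align*}
  A^{T} y + z - c &= 0, \\
  A x - b &= 0, \\
  X Z e - \mu e &= 0.
\end{align*}
% Conditioning: the x-derivative of \mu X^{-1} e is -\mu X^{-2}, whose i-th
% diagonal entry has magnitude \mu / x_i^2 = z_i / x_i when x_i z_i = \mu; this is
% unbounded as x_i -> 0 with z_i bounded away from zero.  The derivatives of
% XZe - \mu e with respect to x and z are Z and X, which remain bounded.
```

The two systems have the same solutions for $x > 0$, but Newton's method applied to them generates different iterates, because the transformation from $z = \mu X^{-1} e$ to $XZe = \mu e$ is nonlinear.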

Table 1: Numerical results